A Distribution-Based Clustering Algorithm for Mining in Large Spatial Databases
نویسندگان
چکیده
The problem of detecting clusters of points belonging to a spatial point process arises in many applications. In this paper , we introduce the new clustering algorithm DBCLASD (Distribution Based Clustering of LArge Spatial Databases) to discover clusters of this type. The results of experiments demonstrate that DBCLASD, contrary to partitioning algorithms such as CLARANS, discovers clusters of arbitrary shape. Furthermore, DBCLASD does not require any input parameters, in contrast to the clustering algorithm DBSCAN requiring two input parameters which may be difficult to provide for large databases. In terms of efficiency, DBCLASD is between CLARANS and DBSCAN, close to DBSCAN. Thus, the efficiency of DBCLASD on large spatial databases is very attractive when considering its nonparametric nature and its good quality for clusters of arbitrary shape.
منابع مشابه
A Novel Spatial Clustering Algorithm Based on Delaunay Triangulation
Exploratory data analysis is increasingly more necessary as larger spatial data is managed in electro-magnetic media. Spatial clustering is one of the very important spatial data mining techniques which is the discovery of interesting relationships and characteristics that may exist implicitly in spatial databases. So far, a lot of spatial clustering algorithms have been proposed in many applic...
متن کاملExtracting Prior Knowledge from Data Distribution to Migrate from Blind to Semi-Supervised Clustering
Although many studies have been conducted to improve the clustering efficiency, most of the state-of-art schemes suffer from the lack of robustness and stability. This paper is aimed at proposing an efficient approach to elicit prior knowledge in terms of must-link and cannot-link from the estimated distribution of raw data in order to convert a blind clustering problem into a semi-supervised o...
متن کاملSolving Data Clustering Problems using Chaos Embedded Cat Swarm Optimization
In this paper, a new method is proposed for solving the data clustering problem using Cat Swarm Optimization (CSO) algorithm based on chaotic behavior. The problem of data clustering is an important section in the field of the data mining, which has always been noted by researchers and experts in data mining for its numerous applications in solving real-world problems. The CSO algorithm is one ...
متن کاملMining the Banking Customer Behavior Using Clustering and Association Rules Methods
The unprecedented growth of competition in the banking technology has raised the importance of retaining current customers and acquires new customers so that is important analyzing Customer behavior, which is base on bank databases. Analyzing bank databases for analyzing customer behavior is difficult since bank databases are multi-dimensional, comprised of monthly account records and daily t...
متن کاملSolving Data Clustering Problems using Chaos Embedded Cat Swarm Optimization
In this paper, a new method is proposed for solving the data clustering problem using Cat Swarm Optimization (CSO) algorithm based on chaotic behavior. The problem of data clustering is an important section in the field of the data mining, which has always been noted by researchers and experts in data mining for its numerous applications in solving real-world problems. The CSO algorithm is one ...
متن کامل